<html>
<head>
<link rel="shortcut icon" href="./favicon.ico">
<link rel="stylesheet" type="text/css" href="./style.css">
<link rel="canonical" href="./Word_Reducer.html">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="Reduces multiple words into a single word, using the given Boolean operation. Put differently: it's a [bit-reduction](./Bit_Reducer.html) of each bit position across all words.  The `words_in` input contains all the input words concatenated one after the other.">
<title>Word Reducer</title>
</head>
<body>

<p class="inline bordered"><b><a href="./Word_Reducer.v">Source</a></b></p>
<p class="inline bordered"><b><a href="./legal.html">License</a></b></p>
<p class="inline bordered"><b><a href="./index.html">Index</a></b></p>

<h1>Boolean Word Reducer</h1>
<p>Reduces multiple words into a single word, using the given Boolean
 operation. Put differently: it's a <a href="./Bit_Reducer.html">bit-reduction</a> of
 each bit position across all words.  The <code>words_in</code> input contains all the
 input words concatenated one after the other.</p>
<p>A common use case is to compute multiple results and their selecting
 conditions in parallel, then <a href="./Annuller.html">annul</a> all but the result you
 want and OR-reduce them into a single result. Or don't annul the results,
 but NAND them to see each bit position where the results disagree, and then
 maybe bit-reduce <em>that</em> to signal if <em>any</em> of the results disagree,
 possibly signalling an error.</p>

<pre>
`default_nettype none

module <a href="./Word_Reducer.html">Word_Reducer</a>
#(
    parameter OPERATION  = "",
    parameter WORD_WIDTH = 0,
    parameter WORD_COUNT = 0,

    // Don't change at instantiation
    parameter TOTAL_WIDTH = WORD_WIDTH * WORD_COUNT
)
(
    input   wire    [TOTAL_WIDTH-1:0]   words_in,
    output  wire    [WORD_WIDTH-1:0]    word_out
);

    localparam BIT_ZERO  = {WORD_COUNT{1'b0}};
</pre>

<p>Instantiate the following hardware once for each bit position in a word.
 The <code>bit_word</code> gathers the bit at a given position from all the words.
 (e.g.: all the first bits, all the second bits, etc...) Then, for each
 word, extract the given bit position into the <code>bit_word</code>.</p>

<pre>
    generate

        genvar i, j;

        for (j=0; j < WORD_WIDTH; j=j+1) begin : per_bit

            reg [WORD_COUNT-1:0] bit_word = BIT_ZERO;

            for (i=0; i < WORD_COUNT; i=i+1) begin : per_word
                always @(*) begin
                    bit_word[i] = words_in[(WORD_WIDTH*i)+j];
                end
            end
</pre>

<p>Then reduce the <code>bit_word</code> into the output bit using the specified Boolean
 function.  (i.e.: all input words first bits, gathered into <code>bit_word</code>,
 reduce to the first output word bit).  I use the
 <a href="./Bit_Reducer.html">Bit_Reducer</a> here to both express that word reduction
 is a composition of bit reduction, and to avoid having to rewrite each
 possible case along with the special linter directives to avoid width
 warnings.</p>
<p>The downside is that the list of possible operations is not visible here,
 but if you need to find them out, then reading the bit reducer code is the
 best documentation. And if you need to add an operation, then the word
 reducer code remains unchanged.</p>

<pre>
            <a href="./Bit_Reducer.html">Bit_Reducer</a>
            #(
                .OPERATION      (OPERATION),
                .INPUT_COUNT    (WORD_COUNT)
            )
            bit_position
            (
                .bits_in        (bit_word),
                .bit_out        (word_out[j])
            );
        end

    endgenerate

endmodule
</pre>

<h2>Alternate Implementation</h2>
<p>There exists an alternate implementation of word reduction which is
 differently elegant, but has a couple of pitfalls and cannot re-use the bit
 reducer code. I'll outline it here because it uses looped partial
 calculations with a peeled-out first iteration, which is a common code
 pattern.</p>
<p>Repeatedly using a register in an unclocked loop expresses a combinational
 logic loop, which must be avoided: without special effort the CAD tool
 cannot analyze it for timing, or sometimes even synthesize it. So we create
 an array of registers to hold each partial result, and initialize them to
 zero.</p>
<pre><code>reg [WORD_WIDTH-1:0] partial_reduction [WORD_COUNT-1:0];

integer i;

initial begin
    for(i=0; i &lt; WORD_COUNT; i=i+1) begin
        partial_reduction[i] = ZERO;
    end
end
</code></pre>
<p>First, connect the zeroth input word to the zeroth partial result.  This
 peels out the first loop iteration, where the read index would be out of
 range (negative!) otherwise.</p>
<pre><code>always @(*) begin
    partial_reduction[0] = in[0 +: WORD_WIDTH];
</code></pre>
<p>Then OR the previous partial result with the current input word, creating
 the next partial result. Note the start index because of the peeled-out
 first iteration: <code>i=1</code>.  This is where you would implement each possible
 operation, and most of the code would be duplicated boilerplate, differing
 only by the Boolean operator. This is dull, error-prone, and drags in
 synthesis-time complications, such as linter directives and operation
 selection, into the middle of run-time code.</p>
<pre><code>    for(i=1; i &lt; WORD_COUNT; i=i+1) begin
        partial_reduction[i] = partial_reduction[i-1] | words_in[WORD_WIDTH*i +: WORD_WIDTH];
    end
</code></pre>
<p>The last partial result is the final result.</p>
<pre><code>    word_out = partial_reduction[WORD_COUNT-1];
end
</code></pre>

<hr>
<p><a href="./index.html">Back to FPGA Design Elements</a>
<center><a href="https://fpgacpu.ca/">fpgacpu.ca</a></center>
</body>
</html>

